Isomorphism Distance in Multidimensional Time Series and Similarity Search
Abstract
Describing the similarity of time series as a distance is the basis for most data mining research. Existing studies of similarity distance are either based on the "point distance", which ignores the geometric characteristics of time series, or use a non-metric distance, which does not satisfy the triangle inequality and cannot be used directly in indexing and searching. A method for the approximate representation and similarity measurement of time series is proposed. Based on a subspace analysis representation, the time series are represented approximately by an isomorphic transformation. The basic concepts and properties of the isomorphism distance are stated and proved. This distance overcomes the problems that arise when non-metric distances are used as the similarity measure, such as poor robustness and ambiguous concepts. The proposed method is also invariant to translation and rotation. A new pruning method for indexing large time series databases is also proposed. Experimental results show that the proposed method is effective.
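The abstract's point that a distance must satisfy the triangle inequality to be usable in indexing can be illustrated with a minimal sketch. This is not the pruning method proposed in the paper, whose details are not given here. It is generic pivot-based pruning, and it uses plain Euclidean distance as a stand-in metric. The names `euclidean`, `range_search`, `pivot`, and `radius` are illustrative assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance, used here only as a stand-in metric."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def range_search(query, database, pivot, radius, dist=euclidean):
    """Return (matches, skipped) for a range query around `query`.

    `matches` holds the items within `radius` of the query; `skipped`
    counts items rejected without computing their distance to the query.
    """
    # In a real index these pivot distances are computed once, at build time.
    pivot_to_item = [dist(pivot, item) for item in database]
    query_to_pivot = dist(query, pivot)

    matches, skipped = [], 0
    for item, d_pivot_item in zip(database, pivot_to_item):
        # Triangle inequality: d(q, x) >= |d(q, p) - d(p, x)|.
        # If this lower bound already exceeds the radius, x cannot match.
        if abs(query_to_pivot - d_pivot_item) > radius:
            skipped += 1
            continue
        if dist(query, item) <= radius:
            matches.append(item)
    return matches, skipped


database = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
matches, skipped = range_search((1.0, 1.0), database, pivot=(0.0, 0.0), radius=1.5)
```

The lower bound is valid only because the distance is a metric. With a non-metric distance the same test could discard true matches, which is why such distances cannot be plugged into this kind of index directly.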
Similar resources
An Empirical Comparison of Distance Measures for Multivariate Time Series Clustering
Multivariate time series (MTS) data are ubiquitous in science and daily life, and measuring their similarity is a core part of the MTS analysis process. Many research efforts in this context have focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...
Grid Representation of Time Series Data for Similarity Search
Widespread interest in time-series similarity search has increased the need for efficient techniques that reduce the dimensionality of the data and then index it easily using a multidimensional structure. In this paper, we introduce a technique, which we call grid representation, based on a grid approximation of the data. We propose a lower bounding distance measure that enables a bitmap ap...
A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach
In recent years, the advancement of information gathering technologies such as GPS and GSM networks has led to huge, complex datasets such as time series and trajectories. As a result, it is essential to use appropriate methods to analyze the large raw datasets produced. Extracting useful information from large data sets has always been one of the most important challenges in different sciences,...
Experiencing the Shotgun Distance for Time Series Analysis
Similarity search is a core functionality in many data mining algorithms. Over the past decade algorithms were designed to mostly work with human assistance to extract characteristic, aligned patterns of equal length and scaling. We propose the shotgun distance similarity measure that extracts, scales, and aligns segments from a query to a sample time series. This greatly simplifies the time se...
Similarity Search in multidimensional time series using the Coulomb's law
Due to technological innovation and lower production costs of data collecting instruments, there has been a sharp increase in the amount of information available for analysis. Additionally, collected data hold intrinsic relations that cannot be recognized without careful analysis, which requires specific techniques to manipulate the data. In this context, we propose a time series...